410 research outputs found

    New Soft Biscuit Wheat for the Northern Region

    Established and supported under the Australian Government’s Cooperative Research Centre Progra

    Measuring the gap between HMM-based ASR and TTS

    The EMIME European project is conducting research in the development of technologies for mobile, personalised speech-to-speech translation systems. The hidden Markov model is being used as the underlying technology in both automatic speech recognition (ASR) and text-to-speech synthesis (TTS) components; thus, the investigation of unified statistical modelling approaches has become an implicit goal of our research. As one of the first steps towards this goal, we have been investigating commonalities and differences between HMM-based ASR and TTS. In this paper we present results and analysis of a series of experiments that have been conducted on English ASR and TTS systems, measuring their performance with respect to phone set and lexicon, acoustic feature type and dimensionality, and HMM topology. Our results show that, although the fundamental statistical model may be essentially the same, optimal ASR and TTS performance often demands diametrically opposed system designs. This represents a major challenge to be addressed in the investigation of such unified modelling approaches

    Direct optimisation of a multilayer perceptron for the estimation of cepstral mean and variance statistics

    We propose an alternative means of training a multilayer perceptron for the task of speech activity detection based on a criterion to minimise the error in the estimation of mean and variance statistics for speech cepstrum based features using the Kullback-Leibler divergence. We present our baseline and proposed speech activity detection approaches for multi-channel meeting room recordings and demonstrate the effectiveness of the new criterion by comparing the two approaches when used to carry out cepstrum mean and variance normalisation of features used in our meeting ASR system
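
A minimal sketch of cepstral mean and variance normalisation, assuming per-frame speech/non-speech decisions are already available. The function name `cmvn`, the `speech_mask` argument, and the toy data are hypothetical; the mask merely stands in for the output of a speech activity detector and this is not the training criterion proposed in the abstract. It only illustrates why the speech/non-speech decision matters: the statistics are estimated from the frames flagged as speech.

```python
from math import sqrt


def cmvn(frames, speech_mask, eps=1e-8):
    """Normalise each cepstral coefficient using speech-only statistics.

    frames:      list of feature vectors (lists of floats), one per frame
    speech_mask: list of booleans, True where a frame is judged to be speech
                 (assumed to come from some speech activity detector)
    """
    speech = [f for f, is_speech in zip(frames, speech_mask) if is_speech]
    if not speech:
        raise ValueError("no speech frames to estimate statistics from")
    n = len(speech)
    dim = len(speech[0])
    # Per-coefficient mean and (biased) variance over the speech frames only.
    mean = [sum(f[d] for f in speech) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in speech) / n for d in range(dim)]
    std = [sqrt(v + eps) for v in var]
    # Apply the normalisation to every frame, speech or not.
    return [[(f[d] - mean[d]) / std[d] for d in range(dim)] for f in frames]


# Toy data: two speech frames and one outlier non-speech frame.
toy_frames = [[1.0, 2.0], [3.0, 6.0], [100.0, 100.0]]
toy_mask = [True, True, False]
normalised = cmvn(toy_frames, toy_mask)
```

If the outlier frame were wrongly flagged as speech, the estimated mean and variance would shift markedly, which is the kind of estimation error the abstract's criterion targets.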

    Phonological Knowledge Guided HMM State Mapping for Cross-Lingual Speaker Adaptation

    Within the HMM state mapping-based cross-lingual speaker adaptation framework, the minimum Kullback-Leibler divergence criterion has been typically employed to measure the similarity of two average voice state distributions from two respective languages for state mapping construction. Considering that this simple criterion doesn't take any language-specific information into account, we propose a data-driven, phonological knowledge guided approach to strengthen the mapping construction -- state distributions from the two languages are clustered according to broad phonetic categories using decision trees and mapping rules are constructed only within each of the clusters. Objective evaluation of our proposed approach demonstrates reduction of mel-cepstral distortion and that mapping rules derived from a single training speaker generalize to other speakers, with subtle improvement being detected during subjective listening tests
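
A minimal sketch of minimum-KL-divergence state mapping, with and without a restriction to matching phonetic categories. Everything here is an illustrative assumption: the state dictionaries, the hand-assigned `category` labels (the abstract derives its clusters with decision trees), the use of diagonal-covariance Gaussians, and the choice of a symmetric divergence.

```python
from math import log


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(p, q):
    """Sum of KL in both directions (an assumed choice of similarity)."""
    return (kl_diag_gauss(p["mean"], p["var"], q["mean"], q["var"])
            + kl_diag_gauss(q["mean"], q["var"], p["mean"], p["var"]))


def map_states(src_states, tgt_states, use_categories=True):
    """Map each target state to its closest source state.

    With use_categories=True, candidates are limited to source states that
    share the target state's broad phonetic category (hypothetical labels).
    """
    mapping = {}
    for tgt_name, tgt in tgt_states.items():
        candidates = {
            name: s for name, s in src_states.items()
            if not use_categories or s["category"] == tgt["category"]
        }
        if not candidates:  # no state in this category: fall back to all
            candidates = src_states
        mapping[tgt_name] = min(
            candidates, key=lambda name: symmetric_kl(candidates[name], tgt)
        )
    return mapping


# Toy one-dimensional states for two languages.
source = {
    "a_vowel": {"mean": [0.0], "var": [1.0], "category": "vowel"},
    "s_fric": {"mean": [1.0], "var": [1.0], "category": "fricative"},
}
target = {"x": {"mean": [0.9], "var": [1.0], "category": "vowel"}}
unconstrained = map_states(source, target, use_categories=False)
constrained = map_states(source, target, use_categories=True)
```

In this toy case the unconstrained mapping picks the acoustically closest state regardless of its category, while the constrained mapping stays within the vowel cluster.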

    Epidemiology and Impact of Abdominal Oblique Injuries in Major and Minor League Baseball.

    BACKGROUND: Oblique injuries are known to be a common cause of time out of play for professional baseball players, and prior work has suggested that injury rates may be on the rise in Major League Baseball (MLB). PURPOSE: To better understand the current incidence of oblique injuries, determine their impact based on time out of play, and to identify common injury patterns that may guide future injury prevention programs. STUDY DESIGN: Descriptive epidemiological study. METHODS: Using the MLB Health and Injury Tracking System, all oblique injuries that resulted in time out of play in MLB and Minor League Baseball (MiLB) during the 2011 to 2015 seasons were identified. Player demographics such as age, position/role, and handedness were included. Injury-specific factors analyzed included the following: date of injury, timing during season, days missed, mechanism, side, treatment, and reinjury status. RESULTS: A total of 996 oblique injuries occurred in 259 (26%) MLB and 737 (74%) MiLB players. Although the injury rate was steady in MiLB, the MLB injury rate declined (P = .037). A total of 22,064 days were missed at a mean rate of 4413 days per season and 22.2 days per injury. The majority of these occurred during batting (n = 455, 46%) or pitching (n = 348, 35%), with pitchers losing 5 days more per injury than batters (P < .001). The leading side was injured in 77% of cases and took 5 days longer to recover from than trailing side injuries (P = .009). Seventy-nine (7.9%) players received either a corticosteroid or platelet-rich plasma injection, and the mean recovery time was 11 days longer compared with those who did not receive an injection (P < .001). CONCLUSION: Although the rate of abdominal oblique injuries is on the decline in MLB, this is not the case for MiLB, and these injuries continue to represent a significant source of time out of play in professional baseball. The vast majority of injuries occur on the lead side, and these injuries result in the greatest amount of time out of play. The benefit of injections for the treatment of oblique injuries remains unknown
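
The summary figures in the abstract above can be checked with simple arithmetic. The short sketch below recomputes them from the reported counts; the variable names are illustrative, and the five-season count is inferred from the stated 2011 to 2015 study window.

```python
# Counts as reported in the abstract.
total_injuries = 996
mlb_injuries = 259
milb_injuries = 737
days_missed = 22064
seasons = 5  # 2011 through 2015 inclusive (inferred from the study window)
batting = 455
pitching = 348
injected = 79


def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage."""
    return 100.0 * part / whole


days_per_injury = round(days_missed / total_injuries, 1)
days_per_season = round(days_missed / seasons)
mlb_share = round(percent(mlb_injuries, total_injuries))
milb_share = round(percent(milb_injuries, total_injuries))
batting_share = round(percent(batting, total_injuries))
pitching_share = round(percent(pitching, total_injuries))
injected_share = round(percent(injected, total_injuries), 1)
```

The recomputed values agree with the reported 22.2 days per injury, 4413 days per season, and the 26%, 74%, 46%, 35%, and 7.9% shares.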

    The 2005 AMI system for the transcription of speech in meetings

    In this paper we describe the 2005 AMI system for the transcription of speech in meetings used for participation in the 2005 NIST RT evaluations. The system was designed for participation in the speech to text part of the evaluations, in particular for transcription of speech recorded with multiple distant microphones and independent headset microphones. System performance was tested on both conference room and lecture style meetings. Although input sources are processed using different front-ends, the recognition process is based on a unified system architecture. The system operates in multiple passes and makes use of state of the art technologies such as discriminative training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, speaker adaptation with maximum likelihood linear regression and minimum word error rate decoding. In this paper we describe the system performance on the official development and test sets for the NIST RT05s evaluations. The system was jointly developed in less than 10 months by a multi-site team and was shown to achieve very competitive performance

    Tracter: A Lightweight Dataflow Framework

    Tracter is introduced as a dataflow framework particularly useful for speech recognition. It is designed to work on-line in real-time as well as off-line, and is the feature extraction means for the Juicer transducer based decoder. This paper places Tracter in context amongst the dataflow literature and other commercial and open source packages. Some design aspects and capabilities are discussed. Finally, a fairly large processing graph incorporating voice activity detection and feature extraction is presented as an example of Tracter's capabilities

    A study of phoneme and grapheme based context-dependent ASR systems

    In this paper we present a study of automatic speech recognition systems using context-dependent phonemes and graphemes as sub-word units based on the conventional HMM/GMM system as well as tandem system. Experimental studies conducted on three different continuous speech recognition tasks show that systems using only context-dependent graphemes can yield competitive performance on small to medium vocabulary tasks when compared to a context-dependent phoneme-based automatic speech recognition system. In particular, we demonstrate the utility of tandem features that use an MLP trained to estimate phoneme posterior probabilities in improving grapheme based recognition system performance by incorporating phonemic knowledge into the system without having to explicitly define a phonetically transcribed lexicon